Improving Human-Object Interaction Detection via Phrase Learning and Label Composition
Abstract
Human-Object Interaction (HOI) detection is a fundamental task in high-level human-centric scene understanding. We propose PhraseHOI, containing an HOI branch and a novel phrase branch, to leverage language prior and improve relation expression. Specifically, the phrase branch is supervised by semantic embeddings, whose ground truths are automatically converted from the original HOI annotations without extra human efforts. Meanwhile, a novel label composition method is proposed to deal with the long-tailed problem in HOI, which composites novel phrase labels by semantic neighbors. Further, to optimize the phrase branch, a loss composed of a distilling loss and a balanced triplet loss is proposed. Extensive experiments are conducted to prove the effectiveness of the proposed PhraseHOI, which achieves significant improvement over the baseline and surpasses previous state-of-the-art methods on Full and NonRare on the challenging HICO-DET benchmark.
Similar resources
Improving Phrase Extraction via MBR Phrase Scoring and Pruning
One of the major reasons for translation errors in phrase-based SMT systems is the incorrect phrases induced from inaccurately word-aligned parallel data. In this paper, we propose a novel approach that uses the minimum Bayes-risk (MBR) principle to improve the accuracy of phrase extraction. Our approach performs as a four-stage pipeline: first, bilingual phrases are extracted from parallel corpu...
Improving Classification by Improving Labelling: Introducing Probabilistic Multi-Label Object Interaction Recognition
This work deviates from easy-to-define class boundaries for object interactions. For the task of object interaction recognition, often captured using an egocentric view, we show that semantic ambiguities in verbs and recognising sub-interactions along with concurrent interactions result in legitimate class overlaps (Figure 1). We thus aim to model the mapping between observations and interactio...
Event Understanding of Human-Object Interaction: Object Movement Detection via Stable Changes
This chapter proposes an object movement detection method in household environments. The proposed method detects “object placement” and “object removal” via images captured by environment-embedded cameras. When object movement detection is performed in household environments, there are several difficulties: the method needs to detect object movements robustly even if sizes of objects are small,...
Improving semistatic compression via phrase-based modeling
In recent years, new semistatic word-based byte-oriented text compressors, such as Tagged Huffman and those based on Dense Codes, have shown that it is possible to perform fast direct search over compressed text and decompression of arbitrary text passages over collections reduced to around 30-35% of their original size. Much of their success is due to the use of words as source symbols and a...
Learning Composition Models for Phrase Embeddings
Lexical embeddings can serve as useful representations for words for a variety of NLP tasks, but learning embeddings for phrases can be challenging. While separate embeddings are learned for each word, this is infeasible for every phrase. We construct phrase embeddings by learning how to compose word embeddings using features that capture phrase structure and context. We propose efficient unsup...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2022
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v36i2.20041